11 research outputs found

    Deep Learning Methods for Classification of Glioma and its Molecular Subtypes

    Diagnosis and timely treatment play an important role in preventing brain tumor growth. Clinicians are unable to reliably predict the molecular subtypes of low-grade glioma (LGG) from magnetic resonance imaging (MRI) without taking a biopsy, so accurate diagnosis prior to surgery would be valuable. Recently, non-invasive classification methods such as deep learning have shown promising results in predicting glioma subtypes from pre-operative brain scans. However, deep learning needs a large amount of annotated medical data on tumors. This thesis investigates methods for dealing with data scarcity, specifically for molecular LGG subtypes. It focuses on two challenges in improving the classification of gliomas and their molecular subtypes from MRIs: using data augmentation and domain mapping to overcome the lack of data, and using data without ground-truth (GT) annotation to avoid the tedious task of manually marking tumor boundaries. Data augmentation generates synthetic MR images with Generative Adversarial Networks (GANs) to enlarge the training data. Another type of GAN, CycleGAN, enlarges the dataset by mapping data from different domains to a target domain. A multi-stream Convolutional Autoencoder (CAE) classifier is proposed with a 2-stage training strategy. To enable MRI data to be used without tumor annotation, an ellipse bounding box is proposed that gives comparable classification performance. The thesis comprises papers addressing the challenging problems of data scarcity and missing tumor annotation. The proposed methods can benefit future research on bringing machine learning tools into clinical practice for non-invasive diagnostics that would assist surgeons and patients in the shared decision-making process.

    Deep Learning Methods for Classification of Gliomas and Their Molecular Subtypes, From Central Learning to Federated Learning

    Gliomas are the most common type of brain cancer in adults. Under the updated 2016 World Health Organization (WHO) classification of tumors of the central nervous system (CNS), identifying the molecular subtypes of gliomas is important. For low-grade gliomas (LGGs), predicting molecular subtypes from magnetic resonance imaging (MRI) scans can be difficult without taking a biopsy. With the development of machine learning (ML) methods such as deep learning (DL), molecular-based classification from MRI scans has shown promising results that may assist clinicians in prognosis and in deciding on a treatment strategy. However, DL requires large training datasets with tumor class labels and tumor boundary annotations, and manual annotation of tumor boundaries is a time-consuming and expensive process. The thesis is based on the work developed in five papers on gliomas and their molecular subtypes. We propose novel methods that provide improved performance. The proposed methods consist of a multi-stream convolutional autoencoder (CAE)-based classifier, a deep convolutional generative adversarial network (DCGAN) to enlarge the training dataset, a CycleGAN to handle domain shift, a novel federated learning (FL) scheme to allow local client-based training with dataset protection, and the use of bounding boxes on MRIs when tumor boundary annotations are not available. Experimental results showed that DCGAN-generated MRIs enlarged the original training dataset and improved the classification performance on test sets. CycleGAN showed good domain adaptation on multiple source datasets and improved the classification performance. The proposed FL scheme showed slightly degraded performance compared with the central learning (CL) approach while protecting dataset privacy. Tumor bounding boxes proved to be an alternative to tumor boundary annotation for tumor classification and segmentation, trading a slight decrease in performance for time saved in manual marking by clinicians. The proposed methods may benefit future research on bringing DL tools into clinical practice to assist tumor diagnosis and support the decision-making process.
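The federated learning idea mentioned in this abstract, training on local clients and sharing only model parameters, can be illustrated with the generic FedAvg aggregation step. This is a minimal sketch and not the thesis's own FL scheme, which the abstract does not specify; the function name `federated_average` and the toy parameter lists are hypothetical.

```python
import numpy as np

def federated_average(client_weights, client_sizes):
    """Combine per-client parameter lists into one global list (generic FedAvg).

    client_weights: list over clients; each item is a list of NumPy arrays (one per layer).
    client_sizes:   number of local training samples held by each client.
    Only parameters are exchanged; the raw data never leaves a client.
    """
    sizes = np.asarray(client_sizes, dtype=float)
    fractions = sizes / sizes.sum()  # weight each client by its share of the data
    n_layers = len(client_weights[0])
    return [
        sum(f * client[layer] for f, client in zip(fractions, client_weights))
        for layer in range(n_layers)
    ]

# Toy usage with two hypothetical hospitals holding 100 and 300 samples.
hospital_a = [np.array([1.0, 1.0]), np.array([0.0])]
hospital_b = [np.array([3.0, 3.0]), np.array([4.0])]
global_model = federated_average([hospital_a, hospital_b], [100, 300])
print(global_model)
```

Weighting by local dataset size means a client with more scans has proportionally more influence on the global model.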

    Classification-based tests for neuroimaging data analysis: comparison of best practices

    In neuroimaging data analysis, classification algorithms are frequently used to discriminate between two populations of interest, such as patients and healthy controls, or between stimuli presented to the subject, such as faces and houses. Usually, the ability of the classifier to discriminate the populations is used within a statistical test in order to evaluate scientific hypotheses. In the literature, different procedures are adopted to carry out such tests, such as using permutations, assuming the binomial model or using confidence intervals. Moreover, practitioners make multiple choices when implementing those tests, such as the actual classification algorithm or the use of a resampling scheme. In this work we analyze those procedures and some of those choices with respect to their effect on Type I (false positive) and Type II (false negative, which determines sensitivity) errors. With a simulation study, we compare the different procedures and show their impact in practice. The final aim is to characterize the best practices and give more insight into their use.
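One of the procedures named in this abstract, a permutation test on classifier accuracy, can be sketched as follows. This is a minimal illustration under stated assumptions: the nearest-centroid classifier, the single split-half evaluation and the synthetic Gaussian data are placeholders chosen for brevity, not the setups compared in the paper.

```python
import numpy as np

def nearest_centroid_accuracy(X_train, y_train, X_test, y_test):
    """Accuracy of a nearest-centroid classifier (a stand-in for any classifier)."""
    classes = np.unique(y_train)
    centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in classes])
    # Squared Euclidean distance from every test point to every class centroid.
    dists = ((X_test[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    predictions = classes[dists.argmin(axis=1)]
    return float((predictions == y_test).mean())

def permutation_test(X, y, n_permutations=200, seed=0):
    """Return (observed accuracy, p-value) under the null of no class difference."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(y))  # one fixed train/test split
    half = len(y) // 2
    train, test = order[:half], order[half:]
    observed = nearest_centroid_accuracy(X[train], y[train], X[test], y[test])
    null_scores = np.empty(n_permutations)
    for i in range(n_permutations):
        y_perm = rng.permutation(y)  # shuffling labels breaks any real association
        null_scores[i] = nearest_centroid_accuracy(
            X[train], y_perm[train], X[test], y_perm[test]
        )
    # The +1 terms count the observed labelling as one of the permutations.
    p_value = (1 + np.sum(null_scores >= observed)) / (1 + n_permutations)
    return observed, float(p_value)

# Synthetic two-population example: 40 subjects per group, 5 features.
data_rng = np.random.default_rng(1)
X = np.vstack([data_rng.normal(0.0, 1.0, (40, 5)), data_rng.normal(2.0, 1.0, (40, 5))])
y = np.array([0] * 40 + [1] * 40)
accuracy, p_value = permutation_test(X, y)
print(accuracy, p_value)
```

The smallest attainable p-value is 1 / (1 + n_permutations), so the number of permutations bounds the resolution of the test.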

    A novel federated deep learning scheme for glioma and its subtype classification

    Background: Deep learning (DL) has shown promising results in molecular-based classification of glioma subtypes from MR images. DL requires a large amount of training data to achieve good generalization performance. Since brain tumor datasets are usually small, datasets from different hospitals need to be combined. Data privacy requirements at hospitals often constrain such a practice. Federated learning (FL) has gained much attention lately as it trains a central DL model without requiring hospitals to share data. Method: We propose a novel 3D FL scheme for glioma and its molecular subtype classification. The scheme uses a slice-based DL classifier, EtFedDyn, which extends FedDyn; the key differences are a focal loss cost function to tackle severe class imbalance in the datasets and a multi-stream network to exploit MRIs in different modalities. By combining EtFedDyn with domain mapping as pre-processing and 3D scan-based post-processing, the proposed scheme performs 3D brain scan-based classification on datasets from different dataset owners. To examine whether the FL scheme could replace the central learning (CL) one, we compare the classification performance of the proposed FL scheme and the corresponding CL scheme. Furthermore, detailed empirical analyses were conducted to examine the effect of using domain mapping, 3D scan-based post-processing, different cost functions and different FL schemes. Results: Experiments were done on two case studies: classification of glioma subtypes (IDH mutation and wild-type on the TCGA and US datasets in case A) and glioma grades (high/low grade glioma, HGG and LGG, on the MICCAI dataset in case B). The proposed FL scheme obtained good performance on the test sets: (85.46%, 75.56%) for IDH subtypes and (89.28%, 90.72%) for glioma LGG/HGG, all averaged over five runs. Compared with the corresponding CL scheme, the drop in test accuracy from the proposed FL scheme is small (−1.17%, −0.83%), indicating its good potential to replace the CL scheme. Furthermore, the empirical tests showed increased classification test accuracy from applying: domain mapping (0.4%, 1.85%) in case A; the focal loss function (1.66%, 3.25%) in case A and (1.19%, 1.85%) in case B; 3D post-processing (2.11%, 2.23%) in case A and (1.81%, 2.39%) in case B; and EtFedDyn over the FedAvg classifier (1.05%, 1.55%) in case A and (1.23%, 1.81%) in case B with fast convergence, all of which contributed to the overall performance of the proposed FL scheme. Conclusion: The proposed FL scheme is shown to be effective in predicting glioma and its subtypes from MR images in the test sets, with great potential to replace conventional CL approaches for training deep networks. This could help hospitals maintain their data privacy while using a federated-trained classifier whose performance is nearly the same as that of a centrally trained one. Further detailed experiments have shown that different parts of the proposed 3D FL scheme, such as domain mapping (which makes datasets more uniform) and post-processing (scan-based classification), are essential.
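The focal loss that this abstract credits with handling class imbalance can be sketched in a few lines. The code below follows the standard definition, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t); the default `gamma=2.0`, the optional per-class `alpha` weights and the toy inputs are illustrative assumptions, since the abstract does not give the settings used in EtFedDyn.

```python
import numpy as np

def focal_loss(probs, labels, gamma=2.0, alpha=None):
    """Mean multi-class focal loss.

    probs:  (n_samples, n_classes) predicted class probabilities.
    labels: (n_samples,) integer class indices.
    gamma:  focusing parameter; gamma = 0 recovers plain cross-entropy.
    alpha:  optional (n_classes,) per-class weights, e.g. to up-weight a rare class.
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=int)
    # Probability assigned to the true class, clipped to avoid log(0).
    p_true = np.clip(probs[np.arange(len(labels)), labels], 1e-12, 1.0)
    weights = np.ones_like(p_true) if alpha is None else np.asarray(alpha, dtype=float)[labels]
    # (1 - p_true) ** gamma shrinks the contribution of well-classified samples.
    return float(np.mean(-weights * (1.0 - p_true) ** gamma * np.log(p_true)))

# Toy usage: one confidently correct sample and one uncertain sample.
example_probs = [[0.9, 0.1], [0.3, 0.7]]
example_labels = [0, 1]
print(focal_loss(example_probs, example_labels, gamma=0.0))  # plain cross-entropy
print(focal_loss(example_probs, example_labels, gamma=2.0))  # easy sample down-weighted
```

Because easy examples are down-weighted, the gradient is dominated by hard and typically minority-class samples, which is the motivation the abstract gives for using it.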

    A Feasibility Study on Deep Learning Based Brain Tumor Segmentation Using 2D Ellipse Box Areas

    In most deep learning-based brain tumor segmentation methods, training the deep network requires annotated tumor areas. However, accurate tumor annotation puts high demands on medical personnel. The aim of this study is to train a deep network for segmentation by using ellipse box areas surrounding the tumors. In the proposed method, the deep network is trained by using a large number of unannotated tumor images with foreground (FG) and background (BG) ellipse box areas surrounding the tumor and background, and a small number of patients (<20) with annotated tumors. Training consists of initial training on the two ellipse boxes on unannotated MRIs, followed by refined training on a small number of annotated MRIs. We use a multi-stream U-Net, an extension of the conventional U-Net, for our experiments. This enables the use of complementary information from multi-modality (e.g., T1, T1ce, T2, and FLAIR) MRIs. To test the feasibility of the proposed approach, experiments and evaluation were conducted on two datasets for glioma segmentation. Segmentation performance on the test sets is then compared with that of the same network trained entirely on annotated MRIs. Our experiments show that the proposed method obtained good tumor segmentation results on the test sets: the Dice score on tumor areas is (0.8407, 0.9104), and the segmentation accuracy on tumor areas is (83.88%, 88.47%), for the MICCAI BraTS’17 and US datasets, respectively. Compared with the results from the network trained on all annotated tumors, the drop in segmentation performance from the proposed approach is (0.0594, 0.0159) in Dice score and (8.78%, 2.61%) in segmented tumor accuracy for the MICCAI and US test sets, which is relatively small. Our case studies have demonstrated that training the network for segmentation by using ellipse box areas in place of all annotated tumors is feasible and can be considered as an alternative, trading a small drop in segmentation performance for the time medical experts save on annotating tumors.
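Two quantities in this abstract lend themselves to a short sketch: an ellipse-shaped box area used in place of a pixel-accurate annotation, and the Dice score used to report segmentation quality. The code below is an illustration only. Inscribing the ellipse in an axis-aligned bounding box is an assumption about how such areas could be built, not a description of the paper's procedure, and the box coordinates are made up.

```python
import numpy as np

def ellipse_mask(shape, box):
    """Boolean mask of the ellipse inscribed in box = (row_min, row_max, col_min, col_max)."""
    row_min, row_max, col_min, col_max = box
    center_r, center_c = (row_min + row_max) / 2.0, (col_min + col_max) / 2.0
    # Guard against zero-size boxes so the division below is always defined.
    radius_r = max((row_max - row_min) / 2.0, 0.5)
    radius_c = max((col_max - col_min) / 2.0, 0.5)
    rows, cols = np.ogrid[:shape[0], :shape[1]]
    return ((rows - center_r) / radius_r) ** 2 + ((cols - center_c) / radius_c) ** 2 <= 1.0

def dice_score(pred, target):
    """Dice = 2 |A and B| / (|A| + |B|) for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    denominator = pred.sum() + target.sum()
    if denominator == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return float(2.0 * np.logical_and(pred, target).sum() / denominator)

# Toy usage: compare an ellipse area with a rectangular stand-in "annotation".
foreground = ellipse_mask((64, 64), (10, 30, 20, 50))
annotation = np.zeros((64, 64), dtype=bool)
annotation[12:29, 24:47] = True
print(dice_score(foreground, annotation))
```

A Dice score of 1 means the two masks coincide and 0 means they do not overlap, which is the scale on which the reported values (0.8407, 0.9104) should be read.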

    Prediction of glioma‑subtypes: comparison of performance on a DL classifier using bounding box areas versus annotated tumors

    Background: For brain tumors, identifying the molecular subtypes from magnetic resonance imaging (MRI) is desirable, but remains a challenging task. Recent machine learning and deep learning (DL) approaches may help the classification/prediction of tumor subtypes through MRIs. However, most of these methods require annotated data with ground truth (GT) tumor areas manually drawn by medical experts. Manual annotation is a time-consuming process that places high demands on medical personnel. As an alternative, automatic segmentation is often used. However, it does not guarantee quality and could lead to improper or failed segmented boundaries due to differences in MRI acquisition parameters across imaging centers, as segmentation is an ill-defined problem. Analogous to visual object tracking and classification, this paper shifts the paradigm by training a classifier using tumor bounding box areas in MR images. The aim of our study is to see whether it is possible to replace GT tumor areas by tumor bounding box areas (e.g. ellipse-shaped boxes) for classification without a significant drop in performance. Method: In patients with diffuse gliomas, a deep learning classifier for subtype prediction is trained on tumor regions of interest (ROIs) given by ellipse bounding boxes and compared with one trained on manually annotated data. Experiments were conducted on two datasets (US and TCGA) consisting of multi-modality MRI scans, where the US dataset contained patients with diffuse low-grade gliomas (dLGG) exclusively. Results: Prediction rates were obtained on 2 test datasets: 69.86% for 1p/19q codeletion status on the US dataset and 79.50% for IDH mutation/wild-type on the TCGA dataset. Comparisons with training on annotated GT tumor data showed an average of 3.0% degradation (2.92% for 1p/19q codeletion status and 3.23% for IDH genotype). Conclusion: Using tumor ROIs, i.e., ellipse bounding box tumor areas, to replace annotated GT tumor areas for training a deep learning scheme causes only a modest decline in performance in terms of subtype prediction. Since bounding boxes allow more data to be made available, this may be a reasonable trade-off in which the decline in performance is counteracted by more data.

    Domain Mapping and Deep Learning from Multiple MRI Clinical Datasets for Prediction of Molecular Subtypes in Low Grade Gliomas

    Brain tumors, such as low grade gliomas (LGG), are classified molecularly, which requires the surgical collection of tissue samples. The pre-surgical or non-operative identification of LGG molecular type could improve patient counseling and treatment decisions. However, radiographic approaches to LGG molecular classification are currently lacking, as clinicians are unable to reliably predict LGG molecular type using magnetic resonance imaging (MRI) studies. Machine learning approaches may improve the prediction of LGG molecular classification through MRI; however, the development of these techniques requires large annotated data sets. Merging clinical data from different hospitals to increase case numbers is needed, but the use of different scanners and settings can affect the results, and simply combining the data into one large dataset often has a significant negative impact on performance. This calls for efficient domain adaptation methods. Despite some previous studies on domain adaptation, mapping MR images from different datasets to a common domain without affecting subtle molecular-biomarker information has not been reported yet. In this paper, we propose an effective domain adaptation method based on Cycle Generative Adversarial Network (CycleGAN). The dataset is further enlarged by augmenting more MRIs using another GAN approach. Further, to tackle the issue that brain tumor segmentation requires time and anatomical expertise to put an exact boundary around the tumor, we have used a tight bounding box as a strategy. Finally, an efficient deep feature learning method, multi-stream convolutional autoencoder (CAE) and feature fusion, is proposed for the prediction of molecular subtypes (1p/19q-codeletion and IDH mutation). The experiments were conducted on a total of 161 patients with FLAIR and contrast-enhanced T1-weighted (T1ce) MRIs from two different institutions in the USA and France. The proposed scheme is shown to achieve a test accuracy of 74.81% on 1p/19q codeletion and 81.19% on IDH mutation, with marked improvement over the results obtained without domain mapping. This approach is also shown to have comparable performance to several state-of-the-art methods.

    Use of dropout and sparsity for regularization of autoencoders in deep neural networks (Derin sinir ağlarında oto-kodlayıcının düzenlenmesi için terkinim ve seyreklik kullanımı)

    Cataloged from PDF version of thesis. Includes bibliographical references (leaves 71-74). Thesis (M.S.): Bilkent University, Department of Electrical and Electronics Engineering, İhsan Doğramacı Bilkent University, 2015. Deep learning has emerged as an effective pre-training technique for neural networks with many hidden layers. To overcome the over-fitting issue, large-capacity models are usually used. In this thesis, two methodologies that are frequently used in the deep neural network literature are considered. First, for pre-training, the performance of the sparse autoencoder is improved by adding the p-norm of the sparse penalty term in the over-complete case. This efficiently induces sparsity in the hidden layers of a deep network to overcome over-fitting issues. At the end of training, the features constructed for each layer carry a variety of useful information for initializing a deep network. The accuracy obtained is comparable to that of the conventional sparse autoencoder technique. Second, large-capacity networks suffer from complex co-adaptations between the hidden layers, because the predictions of each unit in the previous layer are combined to generate the features of the next layer. This results in certain redundant features. The idea we propose is therefore to impose a threshold level on the hidden activations so that only the most active units participate in the reconstruction of the features, suppressing the effect of less active units in the optimization. This is implemented by dropping out the k lowest hidden units while retaining the rest. Our simulations confirm the hypothesis that the k-lowest dropouts help the optimization in both the pre-training and fine-tuning phases, giving rise to internal distributed representations that generalize better. Moreover, this model converges more quickly than the conventional dropout method. In a classification task on the MNIST dataset, the proposed idea gives results comparable to previous regularization techniques such as denoising autoencoders and rectifier linear units combined with standard regularizations. The deep networks constructed from the combination of our models achieve results similar to the state of the art obtained by the dropout idea, with lower time complexity, making them well suited to large problem sizes. by Muhaddisa Barat Ali, M.S.
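The k-lowest dropout described in this abstract, which suppresses the least active hidden units and keeps the rest, can be sketched as below. This is a minimal reading of the abstract, not the thesis's implementation: ranking units per sample and setting the dropped activations to zero are assumptions, and the function name `drop_k_lowest` is hypothetical.

```python
import numpy as np

def drop_k_lowest(activations, k):
    """Zero the k smallest activations in each row and keep the others unchanged.

    activations: (n_samples, n_hidden) hidden-layer outputs.
    k:           number of least-active units to suppress per sample.
    """
    activations = np.asarray(activations, dtype=float)
    if not 0 <= k <= activations.shape[1]:
        raise ValueError("k must be between 0 and the number of hidden units")
    output = activations.copy()
    if k == 0:
        return output
    # argpartition places the indices of the k smallest values in the first k columns.
    lowest = np.argpartition(activations, k - 1, axis=1)[:, :k]
    np.put_along_axis(output, lowest, 0.0, axis=1)
    return output

# Toy usage: suppress the two least active of four hidden units.
hidden = np.array([[0.1, 0.9, 0.5, 0.3],
                   [0.8, 0.2, 0.4, 0.6]])
print(drop_k_lowest(hidden, 2))
```

Unlike conventional dropout, which removes units at random, this selection is deterministic given the activations.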
